Conference Proceedings
ClusiVAT: A mixed visual/numerical clustering algorithm for big data
D Kumar, M Palaniswami, S Rajasegarar, C Leckie, JC Bezdek, TC Havens
Proceedings, 2013 IEEE International Conference on Big Data, Big Data 2013 | Published: 2013
Abstract
Recent algorithmic and computational improvements have reduced the time it takes to build a minimal spanning tree (MST) for big data sets. In this paper we compare single linkage clustering based on MSTs built with the Filter-Kruskal method to the proposed clusiVAT algorithm, which is based on sampling the data, imaging the sample to estimate the number of clusters, followed by non-iterative extension of the labels to the rest of the big data with the nearest prototype rule. Numerical experiments with both synthetic and real data confirm the theory that clusiVAT produces true single linkage clusters in compact, separated data. We also show that single linkage fails, while clusiVAT finds high..
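The pipeline the abstract describes (sample the data, cluster the sample, then extend the labels non-iteratively with the nearest prototype rule) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses simple random sampling, takes the number of clusters k as an input in place of the visual estimate, builds the sample MST with Prim's algorithm rather than Filter-Kruskal, and treats every sampled point as a prototype. All function names are hypothetical.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the n-1 MST edges as (weight, i, j) tuples.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def single_linkage_labels(points, k):
    """Single linkage via the MST: drop the k-1 heaviest edges,
    then label the connected components that remain."""
    edges = sorted(mst_edges(points))
    kept = edges[:len(points) - k]  # n-1 edges minus the k-1 heaviest
    root = list(range(len(points)))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]


def sample_then_extend(data, k, sample_size, seed=0):
    """Cluster a random sample (an assumption), then give every point
    the label of its nearest sampled point (nearest prototype rule)."""
    rng = random.Random(seed)
    idx = rng.sample(range(len(data)), sample_size)
    sample = [data[i] for i in idx]
    sample_labels = single_linkage_labels(sample, k)
    labels = []
    for p in data:
        nearest = min(range(sample_size), key=lambda s: math.dist(p, sample[s]))
        labels.append(sample_labels[nearest])
    return labels
```

The extension step is a single pass over the data, so its cost grows linearly with the data size for a fixed sample, which is the property that makes the approach attractive for large data sets.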
Grants
Awarded by ARC
Awarded by Equipment and Facilities scheme (LIEF)
Awarded by Australian Research Council
Funding Acknowledgements
We acknowledge the support of the Australian Research Council (ARC) Research Network on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), the ARC Linkage Project grant (LP120100529), and the ARC Linkage Infrastructure, Equipment and Facilities scheme (LIEF) grant (LF120100129).